Self-Supervised Siamese Learning on Stereo Image Pairs for Depth Estimation in Robotic Surgery
Abstract
INTRODUCTION Robotic surgery has become a powerful tool for performing minimally invasive procedures, offering advantages over traditional surgery in dexterity, precision, and 3D vision. One popular robotic system is the da Vinci surgical platform, which allows preoperative information to be incorporated into live procedures using Augmented Reality (AR). Scene depth estimation is a prerequisite for AR, because accurate registration requires 3D correspondences between preoperative and intraoperative organ models. Over the past decade, there has been much progress on depth estimation for surgical scenes, for example with monocular or binocular laparoscopes [1,2]. More recently, advances in deep learning have enabled depth estimation via Convolutional Neural Networks (CNNs) [3], but training requires a large image dataset with ground truth depths. Inspired by [4], we propose a deep learning framework for surgical scene depth estimation that uses self-supervision for scalable data acquisition. Our framework consists of an autoencoder for depth prediction and a differentiable spatial transformer for training the autoencoder on stereo image pairs without ground truth depths. Validation was conducted on stereo videos collected in robotic partial nephrectomy.
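The abstract does not spell out the loss or the sampling scheme, so the following is only a minimal NumPy sketch of the general idea behind training on stereo pairs without ground truth: a predicted left-view disparity map is used to resample the right image into the left view, and the reconstruction error against the real left image serves as the training signal. It assumes rectified grayscale images, linear interpolation along the horizontal axis, and an L1 photometric error. All function names (`warp_right_to_left`, `photometric_l1`, `depth_from_disparity`) are hypothetical, and the code shows the forward computation only; an actual training setup would run the same operations inside an automatic differentiation framework.

```python
import numpy as np


def warp_right_to_left(right, disparity):
    """Reconstruct the left view by sampling the right image at x - d(x, y).

    right:     (H, W) rectified grayscale right image
    disparity: (H, W) non-negative left-view disparity in pixels (assumed convention)
    """
    h, w = disparity.shape
    # Horizontal sampling positions in the right image, clamped to the image.
    xs = np.arange(w, dtype=np.float64)[None, :] - disparity
    xs = np.clip(xs, 0.0, w - 1.0)
    x0 = np.floor(xs).astype(int)      # left neighbour column
    x1 = np.minimum(x0 + 1, w - 1)     # right neighbour column
    frac = xs - x0                     # linear interpolation weight
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]


def photometric_l1(left, right, disparity):
    """Mean absolute error between the left image and its reconstruction."""
    return float(np.mean(np.abs(left - warp_right_to_left(right, disparity))))


def depth_from_disparity(disparity, focal_px, baseline):
    """Standard rectified-stereo relation: depth = focal length * baseline / disparity."""
    return focal_px * baseline / np.maximum(disparity, 1e-6)
```

In this sketch the loss is zero when the disparity map explains the left image perfectly, which is what lets a network be trained from stereo video alone. Pixels whose sampling position falls outside the right image are clamped to the border, a simplification that a real pipeline would likely handle with masking.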
Related papers
Fusion of stereo and still monocular depth estimates in a self-supervised learning context
We study how autonomous robots can learn by themselves to improve their depth estimation capability. In particular, we investigate a self-supervised learning setup in which stereo vision depth estimates serve as targets for a convolutional neural network (CNN) that transforms a single still image to a dense depth map. After training, the stereo and mono estimates are fused with a novel fusion m...
On the Importance of Stereo for Accurate Depth Estimation: An Efficient Semi-Supervised Deep Neural Network Approach
We revisit the problem of visual depth estimation in the context of autonomous vehicles. Despite the progress on monocular depth estimation in recent years, we show that the gap between monocular and stereo depth accuracy remains large—a particularly relevant result due to the prevalent reliance upon monocular cameras by vehicles that are expected to be self-driving. We argue that the challenge...
Stereo Vision Auto-Alignment And The Unsupervised Search For Objects Of Interest With Depth Estimation
Stereo vision is fast becoming a highly investigated area in the domain of image processing. Depth information may be obtained from stereo or multi-vision images for reconstructing objects in 3D based on 2D information. Robotic applications make use of stereo vision for navigation purposes, locking down targets, as well as simulating human-like behaviour. This paper presents an algorithm for th...
Supervised Traversability Learning for Robot Navigation
This work presents a machine learning method for terrain’s traversability classification. Stereo vision is used to provide the depth map of the scene. Then, a v-disparity image calculation and processing step extracts suitable features about the scene’s characteristics. The resulting data are used as input for the training of a support vector machine (SVM). The evaluation of the traversability ...
Pyramid Stereo Matching Network
Recent work has shown that depth estimation from a stereo pair of images can be formulated as a supervised learning task to be resolved with convolutional neural networks (CNNs). However, current architectures rely on patch-based Siamese networks, lacking the means to exploit context information for finding correspondence in ill-posed regions. To tackle this problem, we propose PSMNet, a pyramid...
Journal: CoRR
Volume: abs/1705.08260
Publication date: 2017